IKE - An Interactive Tool for Knowledge Extraction
Abstract
Recent work on information extraction has suggested that fast, interactive tools can be highly effective; however, creating a usable system is challenging, and few publicly available tools exist. In this paper we present IKE, a new extraction tool that performs fast, interactive bootstrapping to develop high-quality extraction patterns for targeted relations. Central to IKE is the notion that an extraction pattern can be treated as a search query over a corpus. To operationalize this, IKE uses a novel query language that is expressive, easy to understand, and fast to execute, all essential requirements for a practical system. It is also the first interactive extraction tool to seamlessly integrate symbolic (boolean) and distributional (similarity-based) methods for search. An initial evaluation suggests that relation tables can be populated substantially faster than by manual pattern authoring while retaining accuracy, and more reliably than with fully automated tools, an important step towards practical KB construction. We are making IKE publicly available (http://allenai.org/software/interactive-knowledge-extraction).
Similar resources
A standard Interactive Multimedia eBook Generator Engine for e-Learning Process
Introduction: Using standard authoring tools is essential to promote e-learning in the teaching-learning process. Learning content in medical sciences often consists of multimedia elements. On the other hand, it is frequently necessary to revise and update medical content. Hence, access to authoring tools that can encompass multimedia elements and allow easy content revision is helpful in e-...
Expressing high-throughput data in a restructurable, integrated form for knowledge extraction
The growth of high-throughput and combinatorial methods in experimental materials science has pushed human-mediated data processing traditions beyond their limit. In order for such tools to be useful, automated data processing must become an integral part of the scientific workflow. Here we report on components of our scientific data management system, OpenMat, and especially a core component, ...
IKE: An Interactive Klystron Evaluation Program for SLAC Linear Collider Klystron Performance
SLAC has probably accumulated more high-power klystron operating hours in the delivery of beams for physics research than any other laboratory in the world. Operating data has been logged by hand, and searching this data for klystron performance statistics is time-consuming. When the new 65 MW klystrons for the SLC were planned, a computer-based interlock and data recording system was implement...
User-centric Knowledge Extraction and Quality Assurance
An ontology is a machine-readable knowledge collection. There is an abundance of information available for human consumption. Thus, large general-knowledge ontologies are typically generated by tapping into this information source using imperfect automatic extraction approaches that translate human-readable text into machine-readable semantic knowledge. This thesis provides methods for user-driven...
Building Concept Frames based on Text Corpora
Linguists have been using different kinds of frame representation since the emergence of the notion “frame”. The main goal of the annotation system described in this paper is to provide an interactive and easy-to-use tool for structuring concept-specific information in linguistic frames for discourse analysis or cultural studies. These frames take into account background or “world” knowledge as...